Comparing K Nearest Neighbours Methods and Linear Regression - Is There Reason To Select One Over the Other?

Authors

  • Arto Haara
  • Annika S. Kangas
Abstract

Non-parametric k nearest neighbours (k-nn) techniques are increasingly used in forestry problems, especially in remote sensing. Parametric regression analysis has the advantage of well-known statistical theory behind it, whereas the statistical properties of k-nn are less studied. In this study, we compared the relative performance of k-nn and linear regression in an experiment. We examined the effect of three different properties of the data and problem: 1) the effect of increasing non-linearity of the modelling task, 2) the effect of the assumptions concerning the population and 3) the effect of the balance of the sample data. To determine the effect of these three aspects, we used simulated data and simple modelling problems. K-nn and linear regression gave fairly similar results with respect to the average RMSEs. For both methods, a balanced modelling dataset gave better results than an unbalanced one. When the results were examined within diameter classes, the k-nn results were less biased than the regression model results, especially at extreme values of diameter. The differences increased with increasing non-linearity of the model and increasing unbalance of the data. The difference between the methods was more obvious when the assumed model form was not exactly correct.
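The kind of comparison described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' simulation design: the response curve, noise level, diameter range, sampling distributions, class edges and the choice k = 5 are all assumptions made here, and every function name is hypothetical. It fits a straight line and a k-nn predictor to a possibly unbalanced sample, then reports the overall RMSE and the mean error within diameter classes.

```python
import numpy as np

def simulate(n, rng, balanced=True):
    """Draw a hypothetical diameter (cm) and a non-linear response (assumed form)."""
    if balanced:
        d = rng.uniform(5.0, 45.0, size=n)
    else:
        # Unbalanced sample: most observations fall at small diameters.
        d = 5.0 + 40.0 * rng.beta(1.5, 5.0, size=n)
    y = 30.0 * (1.0 - np.exp(-0.08 * d)) + rng.normal(0.0, 1.5, size=n)
    return d, y

def fit_linear(d, y):
    """Ordinary least squares for y = b0 + b1 * d; returns [b0, b1]."""
    X = np.column_stack([np.ones_like(d), d])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def predict_linear(coef, d):
    return coef[0] + coef[1] * d

def predict_knn(d_train, y_train, d_new, k=5):
    """Unweighted k-nn: average the responses of the k closest training diameters."""
    dist = np.abs(d_new[:, None] - d_train[None, :])
    idx = np.argpartition(dist, k - 1, axis=1)[:, :k]
    return y_train[idx].mean(axis=1)

def rmse(y, yhat):
    return float(np.sqrt(np.mean((y - yhat) ** 2)))

def bias_by_class(d, y, yhat, edges):
    """Mean of (prediction - observation) within each diameter class."""
    cls = np.digitize(d, edges)
    return {int(c): float(np.mean(yhat[cls == c] - y[cls == c]))
            for c in np.unique(cls)}

def main():
    rng = np.random.default_rng(0)
    edges = [15.0, 25.0, 35.0]  # assumed diameter class limits (cm)
    d_test, y_test = simulate(1000, rng, balanced=True)
    for balanced in (True, False):
        d_train, y_train = simulate(300, rng, balanced=balanced)
        pred_lin = predict_linear(fit_linear(d_train, y_train), d_test)
        pred_knn = predict_knn(d_train, y_train, d_test, k=5)
        print("balanced sample" if balanced else "unbalanced sample")
        print("  RMSE linear:", round(rmse(y_test, pred_lin), 3))
        print("  RMSE k-nn:  ", round(rmse(y_test, pred_knn), 3))
        print("  bias linear:", bias_by_class(d_test, y_test, pred_lin, edges))
        print("  bias k-nn:  ", bias_by_class(d_test, y_test, pred_knn, edges))

if __name__ == "__main__":
    main()
```

Changing the exponent in the assumed response curve, or the shape of the beta distribution, is one way to vary the non-linearity and the unbalance of the sample; the numbers this sketch prints depend entirely on those assumed settings and say nothing about the paper's own results.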

Similar resources

Pseudo-Likelihood Inference Underestimates Model Uncertainty: Evidence from Bayesian Nearest Neighbours

When using the K-nearest neighbours (KNN) method, one often ignores the uncertainty in the choice of K. To account for such uncertainty, Bayesian KNN (BKNN) has been proposed and studied (Holmes and Adams 2002; Cucala et al. 2009). We present some evidence to show that the pseudo-likelihood approach for BKNN, even after being corrected by Cucala et al. (2009), still significantly underest...

Nearest Neighbour Based Forecast Model for PM10 Forecasting: Individual and Combination Forecasting

Air quality forecasting using the nearest neighbour technique provides an alternative to statistical and neural network models, which need information on predictor variables and an understanding of underlying patterns in the data. The k-nearest neighbour method of forecasting, which does not assume any linear or nonlinear form of the data, is used in this study to obtain the next step forecast of PM10 c...

An Efficient Algorithm for Bayesian Nearest Neighbours

K-Nearest Neighbours (k-NN) is a popular classification and regression algorithm, yet one of its main limitations is the difficulty in choosing the number of neighbours. We present a Bayesian algorithm to compute the posterior probability distribution for k given a target point within a dataset, efficiently and without the use of Markov Chain Monte Carlo (MCMC) methods or simulation—alongside a...

Local Gaussian Processes for Pose Recognition from Noisy Inputs

Gaussian processes have been widely used as a method for inferring the pose of articulated bodies directly from image data. While able to model complex non-linear functions, they are limited due to their inability to model multi-modality caused by ambiguities and varying noise in the data set. For this reason techniques employing mixtures of local Gaussian processes have been proposed to allow ...

Non-linear Methods for Multivariate Statistical Calibration and Their Use in Palaeoecology: a Comparison of Inverse (k-nearest Neighbours, PLS and WA-PLS) and Classical Approaches

Citation: ter Braak, C. J. F. (1995). Non-linear methods for multivariate statistical calibration and their use in palaeoecology: a comparison of inverse (k-nearest neighbours, partial least squares and weighted averaging partial least squares) and classical approaches. Abstract: Current environmental problems, such as acid rain and global warming, have greatly increased interest in fossil speci...

Journal:
  • MCFNS

Volume 4

Publication date: 2012